Abstract: Recent coding strategies for deterministic and noisy relay
networks are related to the pipelining of block Markov encoding. For
deterministic networks, it is shown that pipelined encoding reduces the
encoding delay but not the end-to-end delay. For noisy networks, it is
observed that decode-and-forward exhibits good rate scaling when the
signal-to-noise ratio (SNR) increases.

I. Introduction 
Consider a network represented by a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$
where $\mathcal{V}$ is a set of vertices (or nodes) and $\mathcal{E}$ is a set
of directed edges. There are $M$ messages $W_m$, $m = 1, 2, \ldots, M$, and
every message is associated with one of the nodes. As described in [1, Ch. 3],
with every node $u$ we further associate one channel input $X_u$ and one
channel output $Y_u$. The output $Y_v$ is a (generally noisy) function of the
channel inputs $X_u$ of those nodes $u$ having directed edges
$(u, v) \in \mathcal{E}$. A central clock governs the operation of the
network [2]. The clock ticks $n$ times and node $u$ is permitted to transmit
symbol $X_u^{(i)}$ after clock tick $i - 1$ and before clock tick $i$,
$i = 1, 2, \ldots, n$. The symbol $Y_u^{(i)}$ appears at clock tick $i$. The
network is also causal in the sense that $X_u^{(i)}$ is a function of messages
at node $u$ and the past outputs
$Y_u^{i-1} = Y_u^{(1)}, Y_u^{(2)}, \ldots, Y_u^{(i-1)}$. This graphical model
was considered in [3], for example, where edge-cut bounds were developed.

Suppose there is one message only. The paper [4] develops achievable rates by
using a compress-and-forward (CF) strategy. Suppose further that the channels
are deterministic, which means that $Y_v$ is a function of
$\{X_u : (u, v) \in \mathcal{E}\}$. The paper [5] develops interesting
achievable rates. A simpler version of this problem with broadcasting and
without interference was considered in [51 [5] where capacity theorems were
discovered. An even more basic model was considered in [7] where there is no
broadcasting and no interference. Broadcast erasure and finite-field networks
are considered in [8], [9], [10], [11].

One goal of this document is to revisit and clarify the coding methodology
and analysis of [2], [5]. A second goal is to point out relations to
block-Markov coding and decoding methods [12], [13], [14]. A third goal is to
note that decode-and-forward (DF) exhibits good signal-to-noise ratio (SNR)
scaling because it removes interference [13].



II. Cuts and Bounds 

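Consider a set $\mathcal{S}$ of nodes and let $R$ be the rate of the message.
Let $\mathcal{A}$ be the collection of all cuts $(\mathcal{S}, \mathcal{S}^c)$
that separate the source node $s$ from one of the destinations, where
$\mathcal{S}^c$ is the complement of $\mathcal{S}$ in $\mathcal{V}$. A standard
cut-set bound (see [15, Ch. 14] or [16, Sec. 10.2]) specifies that reliable
communication requires

$$R \le \max_{P_{X_1 X_2 \cdots X_{|\mathcal{V}|}}(\cdot)} \; \min_{(\mathcal{S}, \mathcal{S}^c) \in \mathcal{A}} I(X_{\mathcal{S}}; Y_{\mathcal{S}^c} \mid X_{\mathcal{S}^c}) \tag{1}$$

where $X_{\mathcal{S}} = \{X_u : u \in \mathcal{S}\}$ and similarly for
$Y_{\mathcal{S}^c}$.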

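A simplification of (1) is achieved by defining the two boundaries of
$\mathcal{S}$ as

$$\beta_1(\mathcal{S}) = \{u : (u, v) \in (\mathcal{S}, \mathcal{S}^c)\} \tag{2}$$

$$\beta_2(\mathcal{S}) = \{v : (u, v) \in (\mathcal{S}, \mathcal{S}^c)\}. \tag{3}$$

Here $(u, v) \in (\mathcal{S}, \mathcal{S}^c)$ denotes an edge
$(u, v) \in \mathcal{E}$ with $u \in \mathcal{S}$ and $v \in \mathcal{S}^c$.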



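Let $A - B = \{a : a \in A, a \notin B\}$ and observe that

$$X_{\mathcal{S}} - [X_{\mathcal{S}^c} \, Y_{\beta_2(\mathcal{S})}] - Y_{\mathcal{S}^c - \beta_2(\mathcal{S})}$$

forms a Markov chain. We thus have

$$I(X_{\mathcal{S}}; Y_{\mathcal{S}^c} \mid X_{\mathcal{S}^c}) = I(X_{\mathcal{S}}; Y_{\beta_2(\mathcal{S})} Y_{\mathcal{S}^c - \beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}) = I(X_{\mathcal{S}}; Y_{\beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}). \tag{4}$$

For deterministic networks, (4) becomes

$$I(X_{\mathcal{S}}; Y_{\beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}) = H(Y_{\beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}). \tag{5}$$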



For deterministic networks with broadcasting but no interference, or Aref
networks, the channel output of node $v$ is a vector
$Y_v = [Y_{u,v} : (u, v) \in \mathcal{E}]$ where $Y_{u,v} = f_{u,v}(X_u)$ for
some function $f_{u,v}(\cdot)$. The point is that node $v$ experiences no
interference, a situation encountered if the transmitters use frequency or
time-division multiplexing (FDM/TDM). We simplify the expression (5) as
follows:

$$H(Y_{\beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}) \le H(Y_{\beta_1(\mathcal{S}), \beta_2(\mathcal{S})}) \le \sum_{u \in \beta_1(\mathcal{S})} H(Y_{u, \beta_2(\mathcal{S})}) \tag{6}$$

where $Y_{u, \beta_2(\mathcal{S})}$ collects the outputs $Y_{u,v}$ with
$v \in \beta_2(\mathcal{S})$, and where the two inequalities hold with equality
if the $X_u$, $u \in \mathcal{V}$, are statistically independent. It turns out
that independent $X_u$ are best for Aref networks (see [2, Lemma 1]).

Summarizing, the cut-set bound is 



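$$R \le \max_{P_{X_1 X_2 \cdots X_{|\mathcal{V}|}}(\cdot)} \; \min_{(\mathcal{S}, \mathcal{S}^c) \in \mathcal{A}} \mathrm{Value}(\mathcal{S}, \mathcal{S}^c) \tag{7}$$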







Figure 1: Example of a deterministic relay network with 
no interference. 



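where the value of the cut $(\mathcal{S}, \mathcal{S}^c)$ is

$$\mathrm{Value}(\mathcal{S}, \mathcal{S}^c) = \begin{cases} I(X_{\mathcal{S}}; Y_{\beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}) & \text{in general} \\ H(Y_{\beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}) & \text{for deterministic networks} \\ \sum_{u \in \beta_1(\mathcal{S})} H(Y_{u, \beta_2(\mathcal{S})}) & \text{for Aref networks} \end{cases} \tag{8}$$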



and for Aref networks the optimization over joint input 
distributions results in a product distribution. 

For example, consider the Aref network in Fig. 1 and
$\mathcal{S} = \{1, 2, 3, 7\}$ so that
$Y_{\mathcal{S}^c} = \{Y_4, Y_5, Y_6\}$ where $Y_4 = [Y_{2,4}, Y_{3,4}]$,
$Y_5 = Y_{4,5}$, and $Y_6 = [Y_{2,6}, Y_{5,6}]$. We have
$\beta_1(\mathcal{S}) = \{2, 3\}$, $\beta_2(\mathcal{S}) = \{4, 6\}$, and

$$\mathrm{Value}(\mathcal{S}, \mathcal{S}^c) = H(Y_{2,4} Y_{2,6}) + H(Y_{3,4}). \tag{9}$$

Observe that we must consider the joint entropy of $Y_{2,4}$ and $Y_{2,6}$,
and separately the marginal entropy of $Y_{3,4}$. This separation occurs
because the inputs $X_u$, $u \in \mathcal{V}$, are statistically independent.
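The boundary sets and the Aref cut value of this example can be checked numerically. The following Python sketch is illustrative only: the edge list contains just the edges of Fig. 1 that are named in the text, while the edge functions `funcs` and the uniform four-letter input alphabet are assumptions made here and are not taken from the paper.

```python
from collections import Counter
from math import log2


def boundaries(edges, S):
    """Return (beta1, beta2) of the cut (S, S^c), following (2) and (3)."""
    cut_edges = [(u, v) for (u, v) in edges if u in S and v not in S]
    return {u for u, _ in cut_edges}, {v for _, v in cut_edges}


def aref_cut_value(edges, funcs, alphabet, S):
    """Sum over u in beta1 of H(Y_{u,beta2}) in bits, assuming uniform X_u."""
    beta1, beta2 = boundaries(edges, S)
    total = 0.0
    for u in beta1:
        # Outputs of node u that cross the cut, in a fixed order.
        outs = sorted(v for (a, v) in edges if a == u and v in beta2)
        # Distribution of the output vector induced by a uniform input.
        counts = Counter(tuple(funcs[(u, v)](x) for v in outs) for x in alphabet)
        n = len(alphabet)
        total += -sum(c / n * log2(c / n) for c in counts.values())
    return total


# Edges named in the text; the remaining edges of Fig. 1 are not reproduced.
edges = [(1, 2), (2, 4), (3, 4), (4, 5), (2, 6), (5, 6)]
S = {1, 2, 3, 7}

# Hypothetical edge functions on the assumed alphabet {0, 1, 2, 3}.
funcs = {
    (2, 4): lambda x: x % 2,   # low bit of X_2
    (2, 6): lambda x: x // 2,  # high bit of X_2
    (3, 4): lambda x: x % 2,   # low bit of X_3
}
alphabet = range(4)

beta1, beta2 = boundaries(edges, S)
value = aref_cut_value(edges, funcs, alphabet, S)
```

Under these assumptions the pair $(Y_{2,4}, Y_{2,6})$ is a one-to-one function of $X_2$ and contributes two bits, while $Y_{3,4}$ contributes one bit, which mirrors the joint-plus-marginal structure of (9).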

III. Multicast Coding 

Aref Networks 

We begin with Aref networks and consider acyclic directed graphs. Suppose we
use every edge $(u, v)$ exactly $n$ times by activating the nodes in
topological order. For example, in Fig. 1 we activate node 1 for $n$ clock
ticks, then we activate node 2 for $n$ clock ticks, then node 3, and so forth.
Observe that every node buffers its received symbols so that every transmit
vector is a function of one message sent by the source node. We thus pipeline
transmission to achieve a continuous transmission rate that is the same as the
individual-activation rate. This block structure was also used in [7] and it
reminds us of the block Markov coding structure of [12] except that pipelining
requires no Markov dependencies. We shall return to this issue below when we
consider interference.
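One way to picture the pipelining is as a slot schedule. The Python sketch below is an illustration under an assumption made here, not a construction from the paper: a transmitting node $u$ whose longest-path distance from the source is $d(u)$ sends its length-$n$ vector for message $b$ in slot $b + d(u)$, after buffering the outputs it needs from earlier slots. The example graph is hypothetical.

```python
def pipeline_schedule(edges, source, num_messages):
    """Map slot index -> {node: message index sent by that node in the slot}."""
    # Longest-path depth from the source; terminates because the graph is acyclic.
    depth = {source: 0}
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            if u in depth and depth.get(v, -1) < depth[u] + 1:
                depth[v] = depth[u] + 1
                changed = True

    transmitters = {u for u, _ in edges}  # pure sinks never transmit
    schedule = {}
    for b in range(1, num_messages + 1):
        for u in transmitters:
            if u in depth:
                # All predecessors of u sent message b in strictly earlier slots.
                schedule.setdefault(b + depth[u], {})[u] = b
    return schedule


# Hypothetical acyclic example: node 1 is the source, node 4 a destination.
example_edges = [(1, 2), (1, 3), (2, 3), (3, 4)]
sched = pipeline_schedule(example_edges, source=1, num_messages=2)
for slot in sorted(sched):
    print(slot, sched[slot])
```

In every slot each active node works on exactly one message, so no Markov dependencies between blocks arise, and after an initial fill-up period the source sends a new message in every slot.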



We continue with our achievability proof, which is the 
same as in [2] with minor differences. The proof in [2], in 
turn, follows the steps of Sec. V.A] with the main dif- 
ference being the use of typical sequences. The reason for 
repeating the proof here is to later point out subtle issues 
for deterministic and noisy networks. We use the same
typical sequence sets $T_\delta^n(P_{X_u})$ and $T_\delta^n(P_{Y_u})$ as in
[2]. Let $f_u^n(\cdot) = [f_u^{(i)}(\cdot) : i = 1, 2, \ldots, n]$.

Codebooks. Choose $P_{X_1}(\cdot), P_{X_2}(\cdot), \ldots, P_{X_{|\mathcal{V}|}}(\cdot)$
and suppose that the message is at node $s = 1$. At node 1, choose
$f_1^n(\cdot)$ to map each of the indices in $\{1, 2, \ldots, 2^{nR}\}$ to a
sequence $\underline{x}_1$ drawn uniformly from $T_\delta^n(P_{X_1})$. At node
$u$, $u \ne 1$, choose $f_u^n(\cdot)$ to map each sequence in
$T_\delta^n(P_{Y_u})$ to a sequence drawn uniformly from
$T_\delta^n(P_{X_u})$. Note that we have
$\underline{y}_u \in T_\delta^n(P_{Y_u})$ for all $u$ since
$\underline{x}_u \in T_\delta^n(P_{X_u})$ (see [2, Lemma 4]).

Encoding. Node $u = 1$ transmits $\underline{x}_1(w) = f_1^n(w)$. Node $u$,
$u \ne 1$, transmits $\underline{x}_u(w) = f_u^n(\underline{y}_u(w))$. Note
that we have labeled $\underline{y}_u$ with the message $w$. This makes sense
for deterministic networks because $w$ is mapped to unique $\underline{x}_v$
and $\underline{y}_v$ once the code books are chosen.

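Decoding. Destination node $t$ puts out

$$\hat{w}(\underline{y}_t(w)) = \begin{cases} \text{error} & \text{if } \underline{y}_t(w') = \underline{y}_t(w) \text{ for some } w' \ne w \\ w & \text{otherwise.} \end{cases} \tag{10}$$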



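Analysis. We say that node $u$ can distinguish between $w$ and $w'$ if

$$\underline{y}_u(w) \ne \underline{y}_u(w'). \tag{11}$$

Let $\mathcal{S}(w, w')$ be the set of nodes that can distinguish $w$ and $w'$
and observe that the event $\mathcal{S}(w, w') = \mathcal{S}$ is simply the
event

$$\{\underline{Y}_{\mathcal{S}}(w) \ne \underline{Y}_{\mathcal{S}}(w')\} \cap \{\underline{Y}_{\mathcal{S}^c}(w) = \underline{Y}_{\mathcal{S}^c}(w')\} \tag{12}$$

where the inequality is understood to hold for every node in $\mathcal{S}$.
We may as well consider $s \in \mathcal{S}(w, w')$. An error occurs at
destination node $t$ if $t \notin \mathcal{S}(w, w')$, i.e., if
$(\mathcal{S}(w, w'), \mathcal{S}^c(w, w'))$ is a cut between nodes $s$ and
$t$. Let $\mathcal{A}_t$ be the set of such cuts, i.e., we define
$\mathcal{A}_t = \{\mathcal{S} \subset \mathcal{V} : s \in \mathcal{S}, t \in \mathcal{S}^c\}$.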

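Let $P_e(t, w, w')$ be the average probability that node $t$ cannot
distinguish between $w$ and $w'$, where the average is over the ensemble of
encoding functions. Since the events $\{\mathcal{S}(w, w') = \mathcal{S}\}$
are disjoint for different $\mathcal{S}$, we can write

$$P_e(t, w, w') = \Pr\left[ \bigcup_{\mathcal{S} \in \mathcal{A}_t} \{\mathcal{S}(w, w') = \mathcal{S}\} \right] = \sum_{\mathcal{S} \in \mathcal{A}_t} \Pr[\mathcal{S}(w, w') = \mathcal{S}]. \tag{13}$$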



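Using (12), we can further write$^1$

$$\Pr[\mathcal{S}(w, w') = \mathcal{S}] \le \Pr\left[\underline{Y}_{\mathcal{S}^c}(w) = \underline{Y}_{\mathcal{S}^c}(w') \,\middle|\, \underline{Y}_{\mathcal{S}}(w) \ne \underline{Y}_{\mathcal{S}}(w')\right]$$

$$= \Pr\left[\underline{Y}_{\beta_2(\mathcal{S})}(w) = \underline{Y}_{\beta_2(\mathcal{S})}(w') \,\middle|\, \underline{Y}_{\mathcal{S}}(w) \ne \underline{Y}_{\mathcal{S}}(w')\right] \tag{14}$$

$$= \Pr\left[\underline{Y}_{\beta_1(\mathcal{S}), \beta_2(\mathcal{S})}(w) = \underline{Y}_{\beta_1(\mathcal{S}), \beta_2(\mathcal{S})}(w') \,\middle|\, \underline{Y}_{\beta_1(\mathcal{S})}(w) \ne \underline{Y}_{\beta_1(\mathcal{S})}(w')\right]$$

$$= \prod_{u \in \beta_1(\mathcal{S})} \Pr\left[\underline{Y}_{u, \beta_2(\mathcal{S})}(w) = \underline{Y}_{u, \beta_2(\mathcal{S})}(w') \,\middle|\, \underline{Y}_u(w) \ne \underline{Y}_u(w')\right] \tag{15}$$

where the last step follows because the pairs
$(\underline{X}_u(w), \underline{X}_u(w'))$, $u \in \beta_1(\mathcal{S})$, are
statistically independent if
$\{\underline{Y}_{\mathcal{S}}(w) \ne \underline{Y}_{\mathcal{S}}(w')\}$
occurs, and because there is no interference.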

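We proceed to bound the probability in (15). We have

$$(\underline{x}_u(w'), \underline{y}_{u, \beta_2(\mathcal{S})}(w')) \in T_\delta^n(P_{X_u Y_{u, \beta_2(\mathcal{S})}})$$

by [2, Lemma 4]. The event (12) thus implies

$$(\underline{X}_u(w'), \underline{Y}_{u, \beta_2(\mathcal{S})}(w)) \in T_\delta^n(P_{X_u Y_{u, \beta_2(\mathcal{S})}}). \tag{16}$$

But note that $\underline{X}_u(w')$ is independent of $\underline{X}_u(w)$,
and hence $\underline{Y}_{u, \beta_2(\mathcal{S})}(w)$, when conditioned on
$\{\underline{Y}_u(w) \ne \underline{Y}_u(w')\}$. The probability of (16)
occurring is thus

$$\frac{\left|T_\delta^n\left(P_{X_u Y_{u, \beta_2(\mathcal{S})}} \,\middle|\, \underline{y}_{u, \beta_2(\mathcal{S})}(w)\right)\right|}{\left|T_\delta^n(P_{X_u})\right|}. \tag{17}$$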

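We use [2, Lemma 2] and [2, Lemma 3] to bound

$$\left|T_\delta^n(P_{X_u})\right| \ge (1 - \epsilon_\delta(n)) \cdot 2^{n(1 - \delta) H(X_u)} \tag{18}$$

$$\left|T_\delta^n\left(P_{X_u Y_{u, \beta_2(\mathcal{S})}} \,\middle|\, \underline{y}_{u, \beta_2(\mathcal{S})}(w)\right)\right| \le 2^{n(1 + \delta) H(X_u \mid Y_{u, \beta_2(\mathcal{S})})} \tag{19}$$

where $\epsilon_\delta(n) \to 0$ as $n \to \infty$. The remaining steps are
the same as in [2] and we will not repeat them here. We find that the average
error probability can be made small if $n$ is large and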



$$R < \min_{(\mathcal{S}, \mathcal{S}^c) \in \mathcal{A}} \mathrm{Value}(\mathcal{S}, \mathcal{S}^c). \tag{20}$$



Finally, we optimize over all input distributions. The result is that we can
make the overall rate approach the right-hand side of (7) while at the same
time ensuring reliable communication. The multicast capacity of Aref networks
with cycles can be similarly achieved by constructing a time-parameterized
acyclic graph as described in [17, p. 146] or [7], for example.
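For small networks the minimization in (20) can be carried out by brute force over all cuts in $\mathcal{A}$. The Python sketch below is a minimal illustration: the function `min_cut_rate` and its `value` callback are names introduced here, and the example uses the special case in which every edge carries one independent bit per use (so the cut value is just the number of cut edges), which is an assumption for illustration rather than a general Aref network.

```python
from itertools import combinations


def min_cut_rate(nodes, source, destinations, value):
    """Minimize value(S) over all S containing the source and missing a destination."""
    others = [v for v in nodes if v != source]
    best = float("inf")
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            S = {source, *extra}
            if all(t in S for t in destinations):
                continue  # S separates the source from no destination
            best = min(best, value(S))
    return best


# Hypothetical diamond network with unit-entropy, non-broadcast edges.
diamond = [(1, 2), (1, 3), (2, 4), (3, 4)]


def count_cut_edges(S):
    """Cut value when every cut edge contributes exactly one bit."""
    return sum(1 for u, v in diamond if u in S and v not in S)


unicast = min_cut_rate([1, 2, 3, 4], 1, [4], count_cut_edges)
multicast = min_cut_rate([1, 2, 3, 4], 1, [2, 4], count_cut_edges)
```

The enumeration is exponential in the number of nodes, so it only serves to check small examples. For a real Aref network the callback would return the sum of joint entropies in (8) instead of an edge count.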



$^1$ Note that [2] should have included
$\{\underline{Y}_{\mathcal{S}}(w) \ne \underline{Y}_{\mathcal{S}}(w')\}$ in
the conditioning of its equation (14), since the inclusion of this set is
required for the conditional statistical independence of the
$\underline{X}_u(w)$ across $u$ and $w$. The text in [2] is corrected by
including
$\{\underline{Y}_{\mathcal{S}}(w) \ne \underline{Y}_{\mathcal{S}}(w')\}$ in
the conditioning in (14) and (19); the remaining steps are the same as in [2].



Layered Deterministic Networks 

Relay coding for networks with interference was consid- 
ered in several recent papers [T5J, [THl [201 OH HI] ) ■ However, 
at the moment the problem seems too difficult to solve 
even for networks with 4 nodes (the 3-node problem was solved in [22]).
Instead, the authors of [5] developed an achievable rate where the channel
inputs $X_u$, $u \in \mathcal{V}$, are independent. Two motivations for doing
this are (1) the theory is simplified and (2) independent inputs will give the
proper capacity scaling with SNR since beamforming will not provide scaling
gains (see [1]).

The coding methodology of [5] uses the same random 
coding and mapping at the relays as above. Furthermore, 
for so-called layered networks, the encoding at the source 
is also the same as in [2] because pipelining can be used. 
The difference to [2] lies in the analysis that we now out- 
line with slight modifications. 

To begin, we add a technical step and restrict attention to messages $w$ for
which $\underline{x}_{\mathcal{V}}(w) \in T_\delta^n(P_{X_{\mathcal{V}}})$.
This step hardly reduces the rate since the code words are chosen
independently via the product distribution $P_{X_{\mathcal{V}}}$. We continue
to use the definition that node $u$ can distinguish between $w$ and $w'$ if
(11) is true. Since the network is deterministic, every node knows
$\underline{x}_{\mathcal{V}}(w)$ and $\underline{x}_{\mathcal{V}}(w')$ and so,
given $\underline{y}_u(w)$, node $u$ can check whether

$$(\underline{x}_{\mathcal{V}}(w'), \underline{y}_u(w)) \in T_\delta^n(P_{X_{\mathcal{V}} Y_u}). \tag{21}$$



Let $\mathcal{S}(w, w')$ be the set of nodes that can distinguish between $w$
and $w'$, i.e., (21) does not occur. We note two interesting facts for
deterministic networks:

• the marginal typicality (21) over $u \in \beta_2(\mathcal{S})$ implies the
joint typicality

$$(\underline{x}_{\mathcal{V}}(w'), \underline{y}_{\beta_2(\mathcal{S})}(w)) \in T_\delta^n(P_{X_{\mathcal{V}} Y_{\beta_2(\mathcal{S})}}) \tag{22}$$

• the typicality (22) implies
$\underline{y}_{\beta_2(\mathcal{S})}(w) = \underline{y}_{\beta_2(\mathcal{S})}(w')$
and therefore
$\underline{x}_{\mathcal{S}^c}(w') = \underline{x}_{\mathcal{S}^c}(w)$.

Both of the above facts are simple consequences of the definition of typical
sequences (see [2, Lemma 4]). We thus have the following result.

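Lemma 1. Suppose that
$\underline{X}_{\mathcal{V}}(w) \in T_\delta^n(P_{X_{\mathcal{V}}})$ for all
$w$. The event $\mathcal{S}(w, w') = \mathcal{S}$ in (12) then implies the
event

$$(\underline{X}_{\mathcal{S}}(w'), \underline{Y}_{\beta_2(\mathcal{S})}(w), \underline{X}_{\mathcal{S}^c}(w)) \in T_\delta^n(P_{X_{\mathcal{S}} Y_{\beta_2(\mathcal{S})} X_{\mathcal{S}^c}}) \tag{23}$$

where $\underline{X}_{\mathcal{S}}(w')$ is independent of
$\underline{X}_{\mathcal{V}}(w) \underline{Y}_{\mathcal{V}}(w)$.

Lemma 1 and similar steps as (17)-(19) give

$$\Pr[\mathcal{S}(w, w') = \mathcal{S}] \le 2^{-n[I(X_{\mathcal{S}}; Y_{\beta_2(\mathcal{S})} X_{\mathcal{S}^c}) - 3\delta H(X_{\mathcal{V}})]} = 2^{-n[H(Y_{\beta_2(\mathcal{S})} \mid X_{\mathcal{S}^c}) - 3\delta H(X_{\mathcal{V}})]}. \tag{24}$$

Continuing as for Aref networks, we find that $R$ satisfying (20) is
achievable, where $\mathrm{Value}(\mathcal{S}, \mathcal{S}^c)$ is given by (5)
and where the $X_u$, $u \in \mathcal{V}$, are independent.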




Figure 2: Example of an acyclic deterministic network. 

Acyclic Deterministic Networks 

Consider next acyclic networks. We interpret the coding described in
[5, Sec. VI] as follows. Let $L$ be the length of the longest path from the
source node to any destination node. Transmission is divided into
$B + L - 1$ length-$n$ blocks of symbols, where $B$ is a large integer, and in
every block a different random code is chosen for every node. The random codes
for the source node have $2^{nBR}$ code words for every block. In block $b$,
$b = 1, 2, \ldots, B + L - 1$, the source node maps the long message $w$ with
$nBR$ bits to the codewords of the $b$th code.

For example, consider the network in Fig. 2 where nodes 1 and 4 are the
message and destination nodes, respectively. We have $L = 3$ and the encoding
for $B = 3$ is depicted in Table 1. We have labeled every code word
$\underline{x}_u^{(b)}$ of node $u$ in block $b$ with the channel output
$\underline{y}_u^{(b-1)}$ and message of which it is a function. After the
$B + L - 1 = 5$ transmission blocks are completed, decoding can proceed by
using one's favorite (ML, typicality, etc.) decoding method over all blocks of
outputs. We remark that this method might be considered a special type of
block Markov coding method [12], [13], [14] with Markov dependencies across
all blocks.

We wish to understand if one can improve the end-to-end (encoding and
decoding) delay. Suppose we use the same pipelined encoding method as for Aref
networks or layered networks. In other words, we split the message $w$ into
$B$ blocks $w_1, w_2, \ldots, w_B$ each having $nR$ bits. In block $b$, the
source encoder maps the message $w_b$ to its codeword
$\underline{x}_s^{(b)}(w_b)$. The relays operate as before. However, note that
the relay nodes experience interference, i.e., every relay node's transmission
is affected by several messages. As before, transmission is done using
$B + L - 1$ length-$n$ blocks of symbols, and in every block a new random code
is chosen for every node.

For example, consider again the network in Fig. 2. Suppose we use the
"natural" encoding depicted in Table 2 where we have labeled every code word
$\underline{x}_u^{(b)}$ with the channel output $\underline{y}_u^{(b-1)}$ and
the messages that affect them. The destination could wait until all
$B + L - 1 = 5$ blocks are received and then perform a joint decoding of all
messages. As a result, we recover the rate of the strategy in Table 1 but with
a smaller encoding delay and complexity. This might be important, for
instance, if $w_1$ must be encoded before the messages $w_2$ and $w_3$ arrive
at the source node. Alternatively, we could use backward decoding with a
sliding window of length two. For example, by considering its outputs from
blocks $b = 4, 5$ the destination can decode $w_3$ with the desired mutual
information of $I(X_2 X_3; Y_4)$, and similarly for $w_2$ and $w_1$.

On the other hand, although the encoding delay is reduced as compared to
Table 1, the maximum end-to-end delay has not changed. Moreover, node 4 cannot
use a forward sliding window decoder to reduce the maximum delay. For
instance, consider $w_1$ which one can hope to decode after block $b = 3$.
However, the interference from $w_2$ in the relay signal that depends on
$(w_1, w_2)$ prevents the method from working as desired. We have also tried
other encoding methods but have so far failed to reduce the end-to-end delay
for general acyclic deterministic networks.
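The interference created by pipelining can be made concrete by tracking which message blocks influence each node's codeword in each transmission block. The Python sketch below is illustrative only. Because Fig. 2 is not reproduced here, the edge set is a hypothetical four-node topology chosen to be consistent with $L = 3$ and with the mutual information $I(X_2 X_3; Y_4)$ quoted above; the function name `message_dependencies` is likewise introduced here.

```python
def message_dependencies(edges, source, num_messages, longest_path):
    """Return {(node, block): set of message indices affecting that codeword}."""
    nodes = sorted({u for e in edges for u in e})
    preds = {v: [u for (u, w) in edges if w == v] for v in nodes}
    total_blocks = num_messages + longest_path - 1
    deps = {}
    for b in range(1, total_blocks + 1):
        for u in nodes:
            if u == source:
                # The source sends message w_b in block b and is idle afterwards.
                deps[(u, b)] = {b} if b <= num_messages else set()
            else:
                # The codeword of block b is a function of the block b-1 output,
                # which depends on the block b-1 codewords of the predecessors.
                deps[(u, b)] = set().union(
                    *(deps.get((v, b - 1), set()) for v in preds[u])
                )
    return deps


# Hypothetical stand-in for Fig. 2: source 1, relays 2 and 3, destination 4.
assumed_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)]
deps = message_dependencies(assumed_edges, source=1, num_messages=3, longest_path=3)
for b in range(1, 6):
    print(b, {u: sorted(deps[(u, b)]) for u in (1, 2, 3)})
```

Under the assumed topology, the codeword of node 3 in block 3 depends on both $w_1$ and $w_2$, which is the kind of interference that blocks a forward sliding window decoder, whereas in block 5 it depends on $w_3$ only, which is what backward decoding exploits.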

IV. SNR Scaling 
A specialized SNR scaling result was developed for noisy networks in [4]. The
model in that paper specifies a channel gain $a^{b_{u,v}}$ for edge $(u, v)$,
where $b_{u,v}$ is a positive integer. The parameter $a$ is then made large.
Instead, suppose that the channel inputs $X_u$ are complex numbers and the
channel outputs are

$$Y_v = Z_v + \sum_{u : (u, v) \in \mathcal{E}} g_{u,v} X_u \tag{25}$$

where $g_{u,v}$ is a real and positive gain coefficient, and $Z_v$ is complex
Gaussian noise with independent real and imaginary parts each having variance
$N/2$. The $Z_v$, $v \in \mathcal{V}$, are independent and we add the
constraint $\mathrm{E}[|X_u|^2] \le P$ for all $u$.

Consider the network graph. The cut-set bound (7) is positive only if there is
a Steiner tree rooted at the source node with leaves at every destination node
that has nonzero gains along every edge of the tree. We use DF with block
Markov encoding and sliding window decoding [13], [14] along this tree, with
common-message broadcasting at forks in the tree (recall that we have
full-duplex nodes). This DF strategy effectively removes interference
[13], [14] and can achieve at least the rate

$$R = \log(1 + g_{\min} P / N) \tag{26}$$

where $g_{\min} = \min_{u,v} g_{u,v}$. On the other hand, the cut bound (7)
for the cut $\mathcal{S} = \{s\}$ is at least as restrictive as

$$R \le \log(1 + g_{\max} (|\mathcal{V}| - 1) P / N) \tag{27}$$

where $g_{\max} = \max_{u,v} g_{u,v}$ and the factor $|\mathcal{V}| - 1$
assumes that $X_s$ can be received by all other nodes. Hence, we find that at
high SNR DF achieves within

$$\log_2(1 + g_{\max} (|\mathcal{V}| - 1) P / N) - \log_2(1 + g_{\min} P / N) \approx \log_2\left(\frac{g_{\max}}{g_{\min}} (|\mathcal{V}| - 1)\right) \tag{28}$$

bits of the capacity. The above result generalizes to multi-antenna nodes as
well.
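The high-SNR gap in (28) is easy to check numerically. The Python sketch below evaluates the expressions (26) and (27) as written, with base-2 logarithms, for gain values and a node count that are arbitrary choices made here for illustration.

```python
from math import log2


def df_rate(g_min, snr):
    """DF rate (26) in bits, with snr = P / N."""
    return log2(1 + g_min * snr)


def cut_bound(g_max, num_nodes, snr):
    """Cut bound (27) in bits for the cut S = {s}."""
    return log2(1 + g_max * (num_nodes - 1) * snr)


def gap_limit(g_min, g_max, num_nodes):
    """Right-hand side of (28): the high-SNR gap in bits."""
    return log2(g_max / g_min * (num_nodes - 1))


# Arbitrary illustrative parameters (assumptions, not values from the paper).
g_min, g_max, num_nodes = 0.5, 2.0, 4
limit = gap_limit(g_min, g_max, num_nodes)
gaps = {snr: cut_bound(g_max, num_nodes, snr) - df_rate(g_min, snr)
        for snr in (1.0, 1e3, 1e9)}
for snr, gap in gaps.items():
    print(f"P/N = {snr:g}: gap = {gap:.4f} bits, limit = {limit:.4f} bits")
```

The gap increases with $P/N$ but never exceeds the limit, so the distance between the DF rate and the cut bound stays bounded by a constant that depends only on the gain ratio and the number of nodes.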



Table 1: A coding strategy for the network of Fig. 2 for B = 3.
Table 2: A pipelining strategy for the network of Fig. 2 for B = 3.